Introduction to a Proofreading Tool for Chinese Spelling Check Task of SIGHAN-8
Authors
Abstract
The detection and correction of erroneous Chinese characters is an important problem in many applications. This paper proposes an automatic method for correcting erroneous Chinese characters. The method is divided into two parts, each handling one type of error: an erroneous character occurring in a word of length one, and an erroneous character occurring in a word of length two or more. The first part relies primarily on a rule-based method, while the second combines similarity and syntactic-rationality parameters in a linear regression model to predict erroneous characters. Experimental results show that the F1 score and false positive rate (FPR) of the proposed method are 0.34 and 0.18, respectively.
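For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1 score and a false positive rate are conventionally computed from confusion counts. The function name `f1_and_fpr` and the counts in the usage line are hypothetical and are not taken from the paper; the exact evaluation protocol of the SIGHAN bake-off may differ in how sentences are counted.

```python
from typing import Tuple


def f1_and_fpr(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (F1, FPR) from confusion counts.

    tp: erroneous items correctly flagged
    fp: correct items wrongly flagged
    tn: correct items left unflagged
    fn: erroneous items missed
    """
    # Precision and recall, guarding against empty denominators.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    # False positive rate: share of correct items that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return f1, fpr


# Hypothetical counts, for illustration only (not the paper's data).
f1, fpr = f1_and_fpr(tp=30, fp=10, tn=40, fn=20)
print(f"F1={f1:.2f}, FPR={fpr:.2f}")
```

A higher F1 is better and a lower FPR is better, so the two numbers should be read together: a system can raise F1 by flagging more characters, but usually at the cost of a higher FPR.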
Similar resources
Introduction to SIGHAN 2015 Bake-off for Chinese Spelling Check
This paper introduces the SIGHAN 2015 Bake-off for Chinese Spelling Check, including task description, data preparation, performance metrics, and evaluation results. The competition reveals current state-of-the-art NLP techniques in dealing with Chinese spelling checking. All data sets with gold standards and evaluation tool used in this bake-off are publicly available for future research.
Chinese Spelling Check Evaluation at SIGHAN Bake-off 2013
This paper introduces an overview of Chinese Spelling Check task at SIGHAN Bake-off 2013. We describe all aspects of the task for Chinese spelling check, consisting of task description, data preparation, performance metrics, and evaluation results. This bake-off contains two subtasks, i.e., error detection and error correction. We evaluate the systems that can automatically point out the spelli...
Overview of SIGHAN 2014 Bake-off for Chinese Spelling Check
This paper introduces a Chinese Spelling Check campaign organized for the SIGHAN 2014 bake-off, including task description, data preparation, performance metrics, and evaluation results based on essays written by Chinese as a foreign language learners. The hope is that such evaluations can produce more advanced Chinese spelling check techniques.
HANSpeller: A Unified Framework for Chinese Spelling Correction
Increased interest in China from foreigners has led to a corresponding interest in the study of Chinese. However, non-native speakers encounter many difficulties in learning Chinese, so Chinese spelling check techniques for Chinese as a Foreign Language (CFL) learners are highly desirable. This paper presents our work on the SIGHAN-2015 Chinese Spelling Check task. The task focuses on sp...
NTOU Chinese Spelling Check System in Sighan-8 Bake-off
This paper describes details of the NTOU Chinese spelling check system in the SIGHAN-8 Bake-off. Besides the basic architecture of the previous system participating in the last two CSC tasks, three new preference rules were proposed to deal with Simplified Chinese characters, variants, sentence-final particles, and DE-particles. A new sentence likelihood function was proposed based on frequencies of space-r...